- Netcache V0.21 alpha 21/2/95
-
- Thank you for downloading the Netcache utility program. This program, when
- run in conjunction with the Netscape Navigator (tm), allows offline reading
- of world wide web pages via Netscape.
-
- Any comments, suggestions for improvement and bug reports should be sent via
- private e-mail to me at
-
- nmott@cix.compulink.co.uk
-
- and I will attempt to address them.
-
- This program, although written and copyrighted by me, is FREEWARE. Please feel
- free to distribute it (or not) to whomever you wish. If you do, however, I would
- request that you make no modification to the executable or documentation
- without my consent and that it is distributed in the original ZIP archive
- format supplied, so that the code and documentation remain together.
-
- Other than that, I hope you find the program useful, and that you enjoy using
- it as much as I have enjoyed producing it.
-
- Thanks
- Neil Mottershead
-
- ------------------------------------------------------------------------------
-
- Features of Netcache V0.1 alpha
-
- Copies all files from the netscape cache into a separate directory and renames
- them to sensible(ish) dos names.
-
- Parses all html files and converts pages and image references to file
- references.
-
- Generates a report of missing images and orphaned images
-
- Generates a netscape history file of all available cached pages
-
- Allows manual addition and linking of page/image files
-
- ------------------------------------------------------------------------------
-
- Additions for Netcache V0.2
-
- All pages and images are now copied to the same directory and the parsed
- html files no longer contain path information.
-
- Additional option to delete html references to nonexistent images to
- avoid stalling netscape when viewing.
-
- Additional option to delete all inline image references from the pages
-
- Additional option to show directory of cached pages
-
- Additional option to copy pages to another path (with inline images)
-
- Additional option to force reparsing of all html files
-
- Additions for Netcache V0.21
-
- Addition of optional cache directory specifier
-
- Modification of url parser to cope with some strange urls
-
- Reduction of heap usage to allow handling of larger cache directories
-
- ------------------------------------------------------------------------------
-
- Configuring Netcache
-
-
- The netcache program requires a configuration file to tell it the location
- of the netscape directory and where to write the database of cached pages
- and images.
-
- The configuration file is called netcache.cfg and must reside in the same
- directory as netcache.
-
- You must manually create the directory that will receive the images/pages
- generated by netcache.
-
- I suggest that the netcache program and config file live in the same
- directory as netscape and that the netcache files are placed in a subdirectory
- of the netscape directory called netcache.
-
- The config file is an ascii file with four lines
-
- <path to images>
- <path to pages>
- <path to netscape>
- <path to netscape cache>
-
- A typical config file would be
-
- c:\easynet\netscape\netcache\
- c:\easynet\netscape\netcache\
- c:\easynet\netscape\
- c:\easynet\netscape\cache\
-
- This specifies that copied pages/images go into the directory
-
- c:\easynet\netscape\netcache
-
- whilst the netscape files are located in directory
-
- c:\easynet\netscape
-
- and the netscape cache is located in directory
-
- c:\easynet\netscape\cache
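-
- As a rough illustration of the layout above, the sketch below (Python, not
- part of Netcache) reads the four lines into named fields. The names
- NetcacheConfig and parse_config are made up for this example, and skipping
- blank lines is an assumption; netcache's own parser may well be stricter.
-
```python
from typing import NamedTuple


class NetcacheConfig(NamedTuple):
    """The four paths, in the order the config file lists them."""
    images: str
    pages: str
    netscape: str
    cache: str


def parse_config(text: str) -> NetcacheConfig:
    """Parse the text of a four-line netcache.cfg into named paths."""
    # Assumption: blank lines and surrounding whitespace are ignored.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) != 4:
        raise ValueError(f"expected 4 path lines, found {len(lines)}")
    return NetcacheConfig(*lines)


# The "typical config file" shown above, written as a Python string.
sample = (
    "c:\\easynet\\netscape\\netcache\\\n"
    "c:\\easynet\\netscape\\netcache\\\n"
    "c:\\easynet\\netscape\\\n"
    "c:\\easynet\\netscape\\cache\\\n"
)
config = parse_config(sample)
print(config.pages)
print(config.cache)
```
-
- A file with fewer or more than four path lines is rejected outright, which
- seems the safest reading of "an ascii file with four lines".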
-
- ------------------------------------------------------------------------------
- Running Netcache
-
- Once configured, netcache can be run by typing the following at a dos
- prompt.
-
- Netcache
-
- The program will then process all files in the netscape cache directory,
- modifying them and then copying them to the netcache directory.
-
- To view the resultant pages, fire up netscape and select the
-
- File/Open File
-
- menu item or press <Ctrl> O
-
- Select the netcache directory using the mouse to display a list of
- available pages and click on a page to view it.
-
- The page will now be displayed complete with inline images.
-
- Once the netscape cache has been processed, its files can be deleted by
- running the following from within the cache directory
-
- del *.*
-
- Sometimes the netcache directory does not contain all the necessary files
- to display a web page, i.e. there are missing images. If this happens, netscape
- will attempt to retrieve the file from the server. As netscape is offline, it
- will not be able to contact the server and will appear to hang. To view the
- page, click on the netscape 'Stop' button to abort the connection attempt. See
- later for ways to overcome this problem.
-
- ------------------------------------------------------------------------------
- Netcache operation
-
- As well as renaming the netscape cache entries to meaningful dos names,
- netcache also attempts to assist Netscape to find and display inline images
- and linked pages.
-
- Netcache also builds up an index of pages and images in the files page.idx and
- image.idx. For each image/page there is one line that holds the filename and the
- site / page / name of the entry. If you wish to delete any files from the
- netcache directory (e.g. orphan images), I strongly recommend also editing the
- relevant .idx file and removing the file's entry by deleting its line and resaving.
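-
- For anyone who would rather script that edit than do it by hand, here is a
- minimal sketch in Python (not part of Netcache). The exact layout of an index
- line is not documented here, so the sketch ASSUMES the filename is the first
- whitespace-separated field on the line; check a real .idx file before relying
- on it. The function name drop_entry and the sample lines are hypothetical.
-
```python
def drop_entry(index_text: str, filename: str) -> str:
    """Return index_text without the lines whose first field is filename."""
    kept = []
    for line in index_text.splitlines(keepends=True):
        fields = line.split()
        # Assumption: the filename is the first field on the line.
        # Compare case-insensitively, since dos filenames ignore case.
        if fields and fields[0].lower() == filename.lower():
            continue
        kept.append(line)
    return "".join(kept)


# Hypothetical index contents, only to show the call.
sample_index = (
    "easylogo.gif http://www.easynet.co.uk/icons/easylogo.gif\n"
    "tour.htm pages/tourist/tour.htm\n"
)
print(drop_entry(sample_index, "EASYLOGO.GIF"), end="")
```
-
- Keep a backup copy of the .idx file before overwriting it with the result.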
-
- Within an html document, an inline image or linked page is referenced with an
- embedded command like the ones below
-
- <img src = http://www.easynet.co.uk/icons/easylogo.gif>
- <A HREF=pages/tourist/tour.htm>
-
- This tells netscape where to find the page/image, in this case on a world
- wide web server.
-
- When netscape loads this page, it puts it in the netscape cache with an
- internal name, and writes a reference to it to a file called FAT that also lives
- in the netscape cache directory. It then does the same thing for all the inline
- images used within that page. However, I believe that once you quit netscape
- and then start a new session, it does not attempt to use this cache info, for
- reasons beyond my comprehension.
-
- Netcache recovers this cache information and builds its own cache
- database. In order for netscape to know that the files are now on the disk, the
- html files have to be amended so that the references above become
-
- <IMG SRC=file:/easylogo.gif>
- <A HREF=file:/tour.htm>
-
- This tells netscape that the files are located in the current directory
- and not on the web server.
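-
- The rewrite described above can be sketched roughly as follows. This is an
- illustration in Python, not netcache's actual parser: the regular expression
- only handles the two tag forms shown above, and the names REF and localise and
- the lookup table "known" (original url to local filename) are assumptions made
- for the example.
-
```python
import re

# Matches the URL in <IMG SRC=...> or <A HREF=...>, quoted or unquoted,
# with optional spaces around "=" as in the examples above.
REF = re.compile(
    r"""(<\s*(?:img\s+src|a\s+href)\s*=\s*)(["']?)([^"'\s>]+)\2""",
    re.IGNORECASE,
)


def localise(html: str, known: dict[str, str]) -> str:
    """Rewrite references whose URL is in `known` to file:/<local name>."""
    def swap(match: re.Match) -> str:
        prefix, quote, url = match.groups()
        local = known.get(url)
        if local is None:
            return match.group(0)  # not in the cache: leave untouched
        return f"{prefix}{quote}file:/{local}{quote}"
    return REF.sub(swap, html)


# Hypothetical lookup table built from the cache contents.
known = {
    "http://www.easynet.co.uk/icons/easylogo.gif": "easylogo.gif",
    "pages/tourist/tour.htm": "tour.htm",
}
print(localise("<img src = http://www.easynet.co.uk/icons/easylogo.gif>", known))
print(localise("<A HREF=pages/tourist/tour.htm>", known))
```
-
- References that are not in the table are left pointing at the web server,
- which is the situation that makes netscape appear to hang when offline.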
-
- Sometimes netscape deletes files from its cache just prior to exiting the
- program for reasons unknown to me. It is possible to recover these images if
- you run an Undelete program on the cache directory. You will unfortunately
- have no immediate information about the nature of these files, but with a
- little ingenuity it is possible to identify them and netcache provides a
- mechanism for manually inserting them into the database.
-
- ------------------------------------------------------------------------------
- Netcache options
-
- The following options affect the conversion process from the netscape
- cache to the netcache directory
-
-
- Netcache /P
-
- This option causes any inline image references to be deleted from the
- modified page when the image is not present in the cache.
-
- Netcache /A
-
- This option causes all pages to be reparsed irrespective of whether or
- not any pages/images have been added. You will need to specify this option if
- you wish to apply the /P option AFTER you have converted the pages.
-
-
- Netcache /L
-
- This option (Lynx mode) will strip all inline graphics from converted
- pages leaving just the text and page links. It must be specified with the /P
- option to work in any meaningful way.
-
-
- The following options work on the converted image/page database in the
- netcache directory and not with the netscape cache directory files
-
- Netcache /F <filename> <url> <type>
-
- This option allows the user to manually add an image/page to the
- database. This is extremely useful in the case where an inline image has not
- been downloaded and you wish to install an image manually rather than delete
- the image reference with the /P option.
-
- Adding an image manually is a two-stage process: firstly you must acquire the
- image to be installed, and secondly you must copy and link it into the database.
-
- As an example, let us say that we have the htm file cafe.htm and that it is
- missing the inline image 'minicyb.gif'
-
- Image Acquisition
-
- There are three options open to us
-
- a/ Undelete option.
-
- Use an Undelete utility in the netscape cache directory and examine each
- of the undeleted .MOZ files for a gif header (GIF87a or GIF89a). These files can
- then be renamed to .GIF and viewed using a gif viewer. If we have a rough idea
- of the nature of the image it should be fairly straightforward to identify it
- and then to rename it to 'minicyb.gif'.
-
- b/ Web retrieval
-
- Find the full http specification of the image either by examining cafe.htm or
- by viewing the netcache .RPT report file section on missing images which in
- this case would be
-
- http://www.easynet.co.uk/icons\minicyb.gif
-
- Connect to your provider and open this as a url and netscape will retrieve
- it and save it straight to disk.
-
- c/ Improvisation
-
- Take any .gif file on hand similar to the one required and copy it to
- 'minicyb.gif'. When the missing image is just a bullet image (e.g.
- blueball.gif) then it is perfectly acceptable to borrow a similar image from a
- different page and reuse it.
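-
- The header check described under option a/ can be automated. The sketch
- below (Python, not part of Netcache) tests the first six bytes of each .MOZ
- file for one of the two standard gif signatures, GIF87a or GIF89a. The names
- looks_like_gif and find_gifs are made up for this example, and the cache path
- in the usage comment is simply the one from the typical config file.
-
```python
from pathlib import Path

# Every gif file begins with one of these two six-byte signatures.
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")


def looks_like_gif(data: bytes) -> bool:
    """True if the data starts with a GIF87a or GIF89a signature."""
    return data.startswith(GIF_SIGNATURES)


def find_gifs(cache_dir: str) -> list[Path]:
    """List the .MOZ files in cache_dir whose first bytes are a gif signature."""
    hits = []
    for path in sorted(Path(cache_dir).iterdir()):
        if path.is_file() and path.suffix.lower() == ".moz":
            with path.open("rb") as handle:
                if looks_like_gif(handle.read(6)):
                    hits.append(path)
    return hits


# Example use, after running an Undelete utility on the cache directory:
#     for path in find_gifs(r"c:\easynet\netscape\cache"):
#         print(path.name)
print(looks_like_gif(b"GIF89a\x01\x00\x01\x00"))
```
-
- This only narrows the search down to the gif files; identifying which one is
- 'minicyb.gif' still means viewing them.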
-
- Insertion in database
-
- Now that we have got our image we need to install it in the database. As
- outlined before the usage is
-
- Netcache /f filename url image/text
-
- in our case this would become
-
- Netcache /f minicyb.gif http://www.easynet.co.uk/icons\minicyb.gif image
-
- The source .gif file must reside in the netcache program directory and
- netcache will then copy it to the netcache file directory and will reparse all
- htm files and update any links to the image. This will obviously not work if
- you have run netcache with the /P option to delete missing image references so
- be warned.
-
- Page references can similarly be added but the last command line parameter
- must be changed from 'image' to 'text' so that the reference is added to the
- correct index.
-
-
- Netcache /d match-pattern
-
- netcache will print out the names of the pages held in its netcache
- directory that match the pattern, which may contain wildcards
-
- Netcache /d *                         displays all pages
- Netcache /d /                         displays all home pages
- Netcache /d http://www.easynet.co.uk  displays easynet pages
- Netcache /d index                     displays the full url of all pages called index
- Netcache /d /stargate/*               lists all pages with the stargate path
-
-
- Netcache /c match-pattern path
-
- netcache will copy all matching pages (complete with their inline
- images) to the specified path
-
- Netcache /c index a:\
-
- will copy any index pages (with images) to the A drive.
-
- ------------------------------------------------------------------------------
- Netcache /H
-
- This option displays the help/copyright banner.
-
- ------------------------------------------------------------------------------
-
- Summary of command line options
-
- Netcache                        Default operation
- Netcache /A /P                  Strip refs to missing inline images
- Netcache /A /L /P               Strip refs to all inline images
- Netcache /F filename url type   Manually insert page/image into database
- Netcache /D url-pattern         List cached web pages
- Netcache /C url-pattern path    Copy cached web pages to another drive/directory
- Netcache /H                     Display help/copyright banner
- Netcache /G                     Windows in-joke
-
- ------------------------------------------------------------------------------
-